Simple Panel Data Approaches with Binary Treatment
Reference articles:
The latter two articles are more advanced and modern; they reflect how people do it these days.
Also see NBER 2023 Methods Lectures on Linear Panel Event Studies: https://www.nber.org/conferences/si-2023-methods-lectures-linear-panel-event-studies
Begin with the simplest possible panel setting with binary treatment:
Object of interest: “average effect of treatment”
Simplest approach: compute average change in \(Y_{it}\) across periods \[ \widehat{AE}_{ES} = \dfrac{1}{N}\sum_{i=1}^N (Y_{i2}- Y_{i1}). \tag{1}\]
Estimator (1) — simplest example of event study estimators (see Freyaldenhoven et al. 2021; Miller 2023).
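As a concrete illustration, a minimal sketch of estimator (1) on simulated data. The data-generating process and all numbers here are hypothetical, chosen only to make the computation visible:

```python
import random

random.seed(0)

N = 1000
# Hypothetical DGP: Gaussian baseline in period 1, plus a unit treatment
# effect and fresh noise in period 2
y1 = [random.gauss(1.0, 1.0) for _ in range(N)]          # Y_{i1}
y2 = [yi + 1.0 + random.gauss(0.0, 0.5) for yi in y1]    # Y_{i2}

# Estimator (1): average change in Y_{it} across the two periods
ae_es = sum(y2[i] - y1[i] for i in range(N)) / N
print(ae_es)  # close to the true effect of 1
```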
Possible empirical framework
Effect of interest: change in stock prices due to the announcement of iPhone
Proposition 1 (Asymptotics for \(\widehat{AE}_{ES}\)) Let \((Y_{i1}, Y_{i2})\) be IID across \(i\) with \(\E|Y_{it}|<\infty\).
Then \[ \widehat{AE}_{ES} \xrightarrow{p} \E[Y_{i2} - Y_{i1}]. \]
Is \(\E[Y_{i2} - Y_{i1}]\) interesting (=causal)?
Need a causal framework to talk about causal effects!
Work in the familiar potential outcomes framework:
For short, use \(Y_{it}^d\) where \(d=0, 1\)
Potential and realized outcomes are connected as \[ Y_{i2} = Y_{i2}^1, \quad Y_{i1} = Y_{i1}^0. \]
It follows that \[ \widehat{AE}_{ES} \xrightarrow{p} \E[Y_{i2}^1- Y_{i1}^0]. \]
\(\E[Y_{i2}^1- Y_{i1}^0]\) is not necessarily a treatment effect; it mixes the effect of treatment with the effect of time!
Context: again consider the iPhone example.
We see a combination of both changes \[ Y_{i2} - Y_{i1} = Y_{i2}^1- Y_{i1}^0 = [Y_{i2}^1- Y_{i2}^0] + [Y_{i2}^0 - Y_{i1}^0] \]
Simple solution: rule out changes over time
Assumption: no variation in potential outcomes \[ Y_{i2}^d= Y_{i1}^d, \quad d=0, 1 \]
Then \(\widehat{AE}_{ES}\) is estimating a causal parameter — average effects \[ \begin{aligned} \widehat{AE}_{ES} & \xrightarrow{p} \E[Y_{i1}^1- Y_{i1}^0] = \E[Y_{i2}^1- Y_{i2}^0] \end{aligned} \]
Time invariance very strict. Why use it if we only work with averages?
Weaker assumption:
Assumption (no trends):
\[
\E[Y_{i2}^d] = \E[Y_{i1}^d], \quad d=0, 1
\]
Allows random variation in potential outcomes between times
Proposition 2 (Causal asymptotics for \(\widehat{AE}_{ES}\)) Let \((Y_{i1}, Y_{i2})\) be IID across \(i\) with finite first moments, and let the no trends assumption hold.
Then \(\widehat{AE}_{ES}\) consistent for causal parameters: \[ \widehat{AE}_{ES} \xrightarrow{p} \E[Y_{i1}^1- Y_{i1}^0] = \E[Y_{i2}^1- Y_{i2}^0] \]
Can also connect \(\widehat{AE}_{ES}\) and OLS
Consider regression model \[ \begin{aligned} Y_{it} & = \beta_0 + \beta_1 D_{it} + u_{it}, \\ D_{it} & = \begin{cases} 1, & t= 2 \\ 0, & t =1 \end{cases} \end{aligned} \tag{2}\] where we simply treat \((Y_{i1}, D_{i1})\) and \((Y_{i2}, D_{i2})\) as separate observations
Proposition 3 (\(\widehat{AE}_{ES}\) is OLS) For \(\beta_1\) of regression (2) \[ \widehat{AE}_{ES} = \hat{\beta}_1^{OLS} \]
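Proposition 3 can be checked numerically. A minimal sketch on simulated data (the DGP and all numbers are hypothetical), computing the pooled OLS slope through the closed form \(\hat\beta_1 = \widehat{\mathrm{cov}}(D, Y)/\widehat{\var}(D)\):

```python
import random

random.seed(1)
N = 500
y1 = [random.gauss(0.0, 1.0) for _ in range(N)]  # period-1 outcomes
y2 = [random.gauss(2.0, 1.0) for _ in range(N)]  # period-2 outcomes

# Event-study estimator (1): average change
ae_es = sum(b - a for a, b in zip(y1, y2)) / N

# Pooled OLS of Y_{it} on a constant and D_{it}, treating the 2N pairs
# (Y, D) as separate observations, via beta_1 = cov(D, Y) / var(D)
y = y1 + y2
d = [0.0] * N + [1.0] * N
ybar = sum(y) / (2 * N)
dbar = sum(d) / (2 * N)
beta1_ols = (sum((di - dbar) * (yi - ybar) for di, yi in zip(d, y))
             / sum((di - dbar) ** 2 for di in d))

print(ae_es, beta1_ols)  # numerically identical
```

With a single binary regressor, the OLS slope is the difference in group means, which is exactly the average change computed by \(\widehat{AE}_{ES}\).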
A way to think about regression in causal settings:
Write down the regression in terms of parameters of interest: e.g. let \[ \beta_0 = \E[Y_{i1}^0], \quad \beta_1 = \E[Y_{i2}^1- Y_{i2}^0] \]
Connect regression to potential outcomes: what is \(u_{it}\) in terms of potential outcomes?
Check properties of this \(u_{it}\). If \(u_{it}\) is “nice”, apply OLS (or another method)
New framework:
New variables for treatment: \[ D_{it, \tau} = \begin{cases} 1, & t= \tau, \\ 0, & t\neq \tau \end{cases} \]
Can try similar regression: \[ Y_{it} = \beta_0 + \sum_{\tau = t_0}^{T} \beta_\tau D_{it, \tau} + u_{it}. \tag{3}\]
Fairly easy to show that \[ \begin{aligned} \hat{\beta}^{OLS}_{\tau} & = \dfrac{1}{N} \sum_{i=1}^N Y_{i\tau} - \dfrac{1}{N(t_0-1)} \sum_{i=1}^N\left[ Y_{i1} + \dots + Y_{it_0-1} \right] \\ & \xrightarrow{p} \E\left[Y_{i\tau}^1 - \dfrac{1}{t_0-1}(Y_{i1}^0+ \dots + Y_{it_0-1}^0) \right] \end{aligned} \] A more general version of the simple estimator from before
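The closed form above is easy to compute directly. A minimal sketch under a hypothetical DGP (the period count, treatment date, and the growing effect sizes are all invented for illustration):

```python
import random

random.seed(2)
N, T, t0 = 400, 6, 4   # hypothetical: periods 1..T, treatment starts in period t0

# Hypothetical DGP: unit-specific level, flat baseline, and a dynamic
# effect that grows by 0.5 in each treated period
def draw_unit():
    base = random.gauss(0.0, 1.0)
    return [base + (0.5 * (t - t0 + 1) if t >= t0 else 0.0) + random.gauss(0.0, 0.3)
            for t in range(1, T + 1)]

panel = [draw_unit() for _ in range(N)]

# Closed form for the OLS coefficient on D_{it,tau} in regression (3):
# mean outcome in period tau minus the average over pre-periods 1..t0-1
def beta_hat(tau):
    post = sum(row[tau - 1] for row in panel) / N
    pre = sum(sum(row[t - 1] for t in range(1, t0)) for row in panel) / (N * (t0 - 1))
    return post - pre

print([round(beta_hat(tau), 2) for tau in range(t0, T + 1)])  # roughly [0.5, 1.0, 1.5]
```

Note how the estimates track the growing true effects, which is the sense in which model (3) allows for dynamics.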
If \(\beta_{\tau}\) — average effect in period \(\tau\), then model (3) seems to allow for dynamic effects
Dynamic effects often realistic: effect of treatment may grow or disappear over time. Example: impact of job training on earnings:
Suppose that the no trends assumption holds \[ \begin{aligned} & \E\left[Y_{i\tau}^1 - \dfrac{1}{t_0-1}(Y_{i1}^0+ \dots + Y_{it_0-1}^0) \right] = \E[Y_{it}^1 - Y_{it}^0] \end{aligned} \] where right hand side does not depend on \(t\)
The no trends assumption rules out dynamic treatment effects
This is bad, since we want to allow for dynamics
Can relax no trends so that it only restricts one of the potential outcomes. It makes more sense to restrict the untreated outcome: restricting \(Y_{it}^1\) would again rule out dynamic effects.
Assumption (no trends in the baseline): \(\E[Y_{it}^0]\) does not depend on \(t\)
Under the no trends in the baseline assumption: \[ \begin{aligned} & \E\left[Y_{i\tau}^1 - \dfrac{1}{t_0-1}(Y_{i1}^0+ \dots + Y_{it_0-1}^0) \right] = \E[Y_{i \tau}^1 - Y_{i\tau}^0] \end{aligned} \] Right hand side is average effect in period \(\tau\)
Proposition 4 Let sampling be IID across \(i\) with finite first moments, and let the no trends in the baseline assumption hold.
Then \(\hat{\beta}_{\tau}\) consistently estimates the average treatment effect in period \(\tau\) given by \(\E[Y_{i \tau}^1 - Y_{i\tau}^0]\).
Moreover, under no trends in the baseline \[ \E[\hat{\beta}_{\tau}] = \E[Y_{i \tau}^1 - Y_{i\tau}^0] \] In other words, the OLS estimator is unbiased
Proposition 5 Let sampling be IID across \(i\) with \(\E[Y_{it}^2]<\infty\) for all \(t\), and let the no trends in the baseline assumption hold.
Then there exists some variance \(V\) such that \[ \sqrt{N}\left( \hat{\beta}_{\tau} - \E[Y_{i \tau}^1 - Y_{i\tau}^0] \right) \Rightarrow N(0, V) \]
Need variance \(V\) to construct confidence intervals for \(\E[Y_{i \tau}^1 - Y_{i\tau}^0]\). How to find \(V\)? Remember the CLT:
If \(X_1, X_2, \dots\) are IID random variables with \(\E[X_i]=\mu\) and \(\E[X_i^2]<\infty\), then \[ \sqrt{N}\left( \dfrac{1}{N}\sum_{i=1}^N X_i - \mu \right)\Rightarrow N\left(0, \var(X_i) \right) \] Asymptotic variance = variance of \(X_i\)
Can apply this idea to find \(V\). Can write \[ \begin{aligned} \hat{\beta}^{OLS}_{\tau} & = \dfrac{1}{N} \sum_{i=1}^N Z_i\\ Z_i & = Y_{i\tau} - \dfrac{1}{(t_0-1)}\left[ Y_{i1} + \dots + Y_{it_0-1} \right] \end{aligned} \]
Then we get that \[ V = \var(Z_i) \]
Can estimate \(V\) with \[ \hat{V} = \widehat{\var}(Z_i) = \dfrac{1}{N}\sum_{i=1}^N\left(Z_i - \dfrac{1}{N}\sum_{j=1}^N Z_j \right)^2 \]
Estimated standard error of \(\hat{\beta}_{\tau}\): \[ \widehat{se}(\hat{\beta}_{\tau}) = \sqrt{ \dfrac{\hat{V}}{N} } \]
Can now construct confidence intervals and hypothesis tests about \(\E[Y_{i \tau}^1 - Y_{i\tau}^0]\). E.g. an asymptotic 95% confidence interval: \[ \widehat{CI}_{95\%} = \left[\hat{\beta}^{OLS}_{\tau} - z_{1-\alpha/2}\widehat{se}(\hat{\beta}_{\tau}), \hat{\beta}^{OLS}_{\tau} + z_{1-\alpha/2}\widehat{se}(\hat{\beta}_{\tau}) \right] \] where the critical values \(z_{1-\alpha/2}\) come from the standard normal distribution \[ z_{1-\alpha/2}= \Phi^{-1}\left(1 - \dfrac{\alpha}{2} \right) \]
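The whole inference pipeline, from \(Z_i\) to the 95% confidence interval, fits in a few lines. A minimal sketch under a hypothetical DGP (two pre-periods, a true effect of 1.0 in period \(\tau\), all numbers invented):

```python
import math
import random

random.seed(3)
N, t0 = 500, 3   # hypothetical: pre-periods 1..2, one post-period tau

# Hypothetical DGP: flat baseline, true effect of 1.0 in period tau
pre1 = [random.gauss(0.0, 1.0) for _ in range(N)]
pre2 = [random.gauss(0.0, 1.0) for _ in range(N)]
y_tau = [1.0 + random.gauss(0.0, 1.0) for _ in range(N)]

# Z_i = Y_{i tau} minus the average of the pre-period outcomes
z = [y_tau[i] - (pre1[i] + pre2[i]) / (t0 - 1) for i in range(N)]

beta_hat = sum(z) / N                                    # point estimate = mean of Z
v_hat = sum((zi - beta_hat) ** 2 for zi in z) / N        # plug-in variance of Z_i
se = math.sqrt(v_hat / N)                                # estimated standard error

ci = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)        # asymptotic 95% CI
print(beta_hat, se, ci)
```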
But what if we want to test the joint hypothesis \[ H_0: \beta_{\tau} = 0, \quad \tau = t_0, \dots, T \]
Proposition 6 (Joint Asymptotics for Estimated Effects) Let sampling be IID across \(i\) with finite second moments, and let the no trends in the baseline assumption hold.
Then there exists some variance-covariance matrix \(\bV\) such that \[ \sqrt{N}(\hat{\bbeta} -\bbeta) \Rightarrow N(0, \bV) \]
Can use Proposition 6 to create a Wald test
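A minimal sketch of the resulting Wald test with two post-periods, so the \(2\times 2\) inverse of \(\hat{\bV}\) can be written in closed form. The DGP (two pre-periods, true effects 0.8 and 1.2) is hypothetical, chosen so that \(H_0\) is false:

```python
import random

random.seed(4)
N = 600   # hypothetical: two pre-periods, two post-periods

# z[i] = (Z_{i,3}, Z_{i,4}): each post outcome minus the pre-period average
z = []
for _ in range(N):
    pre_avg = (random.gauss(0.0, 1.0) + random.gauss(0.0, 1.0)) / 2
    z.append((0.8 + random.gauss(0.0, 1.0) - pre_avg,
              1.2 + random.gauss(0.0, 1.0) - pre_avg))

# Estimated effects: sample means of Z
b1 = sum(zi[0] for zi in z) / N
b2 = sum(zi[1] for zi in z) / N

# Sample variance-covariance matrix of (Z_1, Z_2): the plug-in estimate of V
v11 = sum((zi[0] - b1) ** 2 for zi in z) / N
v22 = sum((zi[1] - b2) ** 2 for zi in z) / N
v12 = sum((zi[0] - b1) * (zi[1] - b2) for zi in z) / N

# Wald statistic for H0: beta_3 = beta_4 = 0, i.e. N * b' V^{-1} b,
# using the closed-form inverse of a 2x2 matrix
det = v11 * v22 - v12 ** 2
wald = N * (b1 * (v22 * b1 - v12 * b2) + b2 * (-v12 * b1 + v11 * b2)) / det

print(wald)  # compare with the chi-squared(2) 95% critical value of about 5.99
```

Since the true effects are nonzero, the statistic far exceeds the critical value and the test rejects \(H_0\).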
Panel Data I: Event Studies